A study on automatic detection of Japanese vowel devoicing for speech synthesis
Authors
Abstract
In corpus-based speech synthesis, the quality of the synthetic speech depends critically on the speech corpus. Since high vowels in Japanese may be devoiced in real speech, devoiced vowels should be detected and transcribed automatically during corpus construction. In this paper, we apply an HMM-based method and adopt two kinds of likelihood differences as voicing measures, each with a different focus. To improve detection performance, discriminative training is applied to the voiced/devoiced HMMs. Moreover, several features that can discriminate voiced from devoiced units, including duration, energy, and autocorrelation, are combined with the likelihood differences in several ways. The experiments show different results for each high vowel, i.e., devoicing is vowel dependent. For the vowel /i/, discriminative training improves detection performance to a certain degree, and accumulating the voicing features and the likelihood differences with optimized weights improves detection accuracy. For the vowel /u/, however, the improvement is very limited, even with the voicing features.
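The combination of likelihood differences and voicing features described in the abstract can be illustrated with a minimal sketch. It assumes the combination is a plain weighted sum compared against a threshold; the abstract does not specify the exact form. The function names, the feature names, and the weight values are hypothetical, and in practice the weights would be optimized on labeled data rather than set by hand.

```python
import math


def autocorr_peak(frame, min_lag, max_lag):
    """Largest normalized autocorrelation over lags in [min_lag, max_lag].

    Values near 1 suggest a periodic (voiced) frame; values near 0
    suggest an aperiodic or silent frame.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, r / energy)
    return best


def voicing_score(llh_voiced, llh_devoiced, features, weights):
    """Weighted sum of the log-likelihood difference and extra features.

    llh_voiced / llh_devoiced: log-likelihoods of the segment under the
    voiced and devoiced models (assumed to come from an external HMM toolkit).
    features / weights: dicts keyed by hypothetical feature names.
    """
    score = weights["llh_diff"] * (llh_voiced - llh_devoiced)
    for name, value in features.items():
        score += weights[name] * value
    return score


def is_devoiced(score, threshold=0.0):
    """Label the vowel as devoiced when the voicing score is below threshold."""
    return score < threshold


# Usage with made-up numbers: a periodic frame and illustrative weights.
frame = [math.sin(2 * math.pi * i / 20) for i in range(200)]
features = {"autocorr": autocorr_peak(frame, 10, 30)}
weights = {"llh_diff": 1.0, "autocorr": 2.0}
score = voicing_score(-10.0, -12.0, features, weights)
print(is_devoiced(score))
```

A weighted sum keeps each cue's contribution easy to inspect, which is convenient when the weights are tuned separately for /i/ and /u/, as the vowel-dependent results suggest might be necessary.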
Similar resources
Vowel devoicing vs. mora-timed rhythm in spontaneous Japanese - inspection of phonetic labels of OGI_TS
This paper discusses two well-known phonetic-phonological phenomena of Japanese: vowel devoicing and mora-timed rhythm. If vowel devoicing is regarded as a process in which vowels become consonantal, it may change the mora-templated syllable structures of Japanese, resulting in deviation from mora-timed rhythm. To investigate spontaneous Japanese with comparison to other languages, the label data of ...
Phonetic imitation of Japanese vowel devoicing
Recent studies have shown that talkers implicitly imitate/accommodate the phonetic properties of recently heard speech [1, 2]. However, it has also been shown that this phonetic imitation effect is not an automatic process [3, 4]: in [3], the artificially lengthened VOT on /p/ was imitated in a non-shadowing task, while shortened VOT (which could jeopardize phonemic contrast) was not imitated, ...
Gemination of Consonant in Spontaneous Speech: An Analysis of the "Corpus of Spontaneous Japanese"
In Japanese, there is frequent alternation between CV morae and moraic geminate consonants. In this study, we analyzed the phonemic environments of consonant gemination (CG) using the “Corpus of Spontaneous Japanese (CSJ).” The results revealed that the environment in which gemination occurs is, to some extent, parallel to that of vowel devoicing. However, there are two crucial differences. One...
How consonants, dialect and speech rate affect vowel devoicing?
We examined the glottal opening pattern in devoicing environments in Japanese, with respect to the factors that facilitate or suppress devoicing. The factors include consonantal environment, dialect, speech rate, consecutive devoicing environments, and phrase-final position. The results indicated that glottal opening patterns are twofold: a single-phase and a double-phase opening for /CVC/. On...
Factors affecting utterance-final vowel devoicing in spontaneous Japanese
Investigation of spontaneous speech corpora has shown that vowel devoicing in Japanese is a statistical phenomenon. However, factors behind vowel devoicing have not been fully studied. In addition, there have been no studies that specifically examined pre-pausal vowel devoicing. In this paper, we investigate vowel devoicing in the pre-pausal position, in particular, vowel devoicing occurring at ...